ABSTRACT

Constructed-response formats are desired for measuring 
complex and dynamic response processes which require the examinee 
to understand the structures of problems and micro-level 
cognitive tasks. These micro-level tasks and their organized 
structures are usually unobservable. This study shows that
elementary graph theory is useful for organizing these micro- 
level tasks and for exploring their properties and relations. 
Moreover, this approach enables us to better understand macro- 
level performances on test items. Then, an attempt to develop a 
general theory of item construction is described briefly and 
illustrated with the domains of fraction addition problems and 
adult literacy. Psychometric models appropriate for various 
scoring rubrics are discussed. 



Introduction 

Recent developments in cognitive theory suggest that new
achievement tests must reflect four important aspects of
performance: the first is to assess the principles of performance
that a test is designed to measure, the second is to measure
dynamic changes in students' strategies, the third is to evaluate
the structure or representation of knowledge and cognitive
skills, and the fourth is to assess the automaticity of
performance skills (Glaser, 1985).

These measurement objectives require a new test theory that 
is both qualitative and quantitative in nature. Achievement 
measures must be both descriptive and interpretable in terms of 
the processes that determine performance. Traditional test
theories have a long history of contributions to American
education through their support of norm-referenced and
criterion-referenced testing.

Scaling of test scores has been an important goal in these
types of testing, while individualized information such as
diagnosis of misconceptions has never been a main concern. In
these contexts, the information objectives for a test depend on
the intended use of the test. Standardized test scores are useful
for admission or selection purposes, but such scores cannot
provide teachers with useful information for designing
remediation. Formative uses of assessment require new techniques,
and this chapter introduces one such technique.

Constructed-response formats are desirable for measuring
complex and dynamic cognitive processes (Bennett, Ward, Rock, & 
LaHart, 1990) while multiple-choice items are suitable for 
measuring static knowledge. Birenbaum and Tatsuoka (1987) 
examined the effect of the response format on the diagnosis of 
examinees' misconceptions and concluded that multiple-choice
items may not provide appropriate information for identifying 
students' misconceptions. The constructed-response format, on 
the other hand, appears to be more appropriate. This finding
also confirms the assertion by Bennett et al. (1990) mentioned
above.

As for the second objective, several studies on "bug" 
stability suggest that bugs tend to change with "environmental 
challenges" (Ginzburg, 1977) or "impasses" (Brown & VanLehn,
1980). Sleeman and his associates (1989) developed an
intelligent tutoring system aimed at the diagnosis of bugs and 
their remediation in algebra. However, bug instability made 
diagnosis uncertain and hence remediation could not be directed. 
Tatsuoka, Birenbaum and Arnold (1990) conducted an experimental 
study to test the stability of bugs and also found that 
inconsistent rule application was common among students who had
not mastered signed-number arithmetic operations. By contrast,
mastery-level students showed a stable pattern of rule
application. These studies strongly indicate that the unit of
diagnosis should be neither erroneous rules nor bugs but somewhat
larger components such as sources of misconceptions or
instructionally relevant cognitive components.

The primary weakness of attempts to diagnose bugs is that 
bugs are tentative solutions for solving the problems when 
students don't have the right skills. 

However, the two identical subtests (32 items each) used in
the signed-number study had almost identical true score curves
for the two-parameter logistic model (Tatsuoka & Tatsuoka, 1991).
This means that bugs are unstable but total scores are very 
stable. Therefore, searching for the stable components that are 
cognitively relevant is an important goal for diagnosis and 
remediation. 

The third objective, evaluating the structure or 
representation of cognitive skills, requires response formats 
different from traditional item types. We need items that ask 
examinees to draw flow charts in which complex relations among
tasks, subtasks, skills and solution paths are expressed
graphically, or that ask examinees to describe such relations
verbally. Questions can be figural response formats in which
examinees are asked to order the causal relationships among 
several concepts and connect them by a directed graph. 

These demanding measurement objectives apparently require a 
new psychometric theory that can accommodate more complicated 
forms of scoring than just right or wrong item-level responses. 
The correct response to the item is determined by whether or not 
all the cognitive tasks involved in the item can be answered 
correctly. Therefore, the hypothesis in this regard is that if
any of the tasks is answered incorrectly, then there is a high
probability that the final answer will also be wrong.

These item-level responses are called macro-level responses 
and those of the task-level are called micro-level responses. 
This report will address such issues as follows: 

The first section will discuss macro-level analyses versus 
micro-level analyses and will focus on the skills and knowledge 
that each task requires. 

The second section will introduce elementary graph theory as
a tool to organize various micro-level tasks and their directed
relations.

Third, a theory for designing constructed-response items
will be discussed and illustrated with real examples. Further,
the connection of this deterministic approach to the
probabilistic models, Item Response Theory and rule space models
(Tatsuoka, 1983, 1990), will also be explained. These models will
be demonstrated as a computational device for drawing inferences
about micro-level performances from the item-level responses.

Finally, possible scoring rubrics suitable for graded, 
continuous and nominal response models will be addressed. 

Macro- and Micro-Level Analyses:
Making Inferences on Unobservable Micro-Level Tasks from
Observable Item-Level Scores

Statistical test theories deal mostly with test scores and
item scores. In this study, these scores are considered to be
macro-level information while the underlying cognitive processes
are viewed as micro-level information. Here we shall be using a
much finer level of observable performances than the item level
or the macro-level.

Looking into underlying cognitive processes and speculating 
about examinees' solution strategies, which are unobservable, may 
be analogous to the situation that modern physics has come 
through in the history of its development. Exploring the 
properties and relations among micro-level objects such as atoms, 
electrons, neutrons and other elementary particles, has led to 
many phenomenal successes in theorizing about physical phenomena 
at the macro-level such as the relation between the loss and gain 
of heat and temperature. Easley and Tatsuoka (1968) state in 
their book Scientific Thought that "the heat lost or gained by a
sample of any non-atomic substance not undergoing a change of
state is jointly proportional to the number of atoms in the
sample and to the temperature change. This strongly suggests 
that both heat and temperature are intimately related to some 
property of atoms." Heat and temperature relate to molecular 
motion and the relation can be expressed by mathematical 
equations involving molecular velocities. 

This finding suggests that, analogously, it might be useful
to explore the properties and relations among micro-level,
invisible tasks and to predict their outcomes, which are
observable as responses to test items. This approach is not new
in scientific research. Our aim is to explore a method that can
scientifically explain macro-level phenomena (in our context,
item-level or test-level achievement) derived from micro-level
tasks. The method should be generalizable from specific relations
in a specific domain to general relations in general domains. In
order to accomplish our goal, elementary graph theory is used.

Identification of Prime Subtasks or Attributes 

The development of an intelligent tutoring system or
cognitive error diagnostic system involves a painstaking and
detailed task analysis in which goals, subgoals and various
solution paths are identified in a procedural network (or a flow
chart). This process of uncovering all possible combinations of
subtasks at the micro-level is essential for making a tutoring
system perform the role of the master teacher, although the
current state of research in expert systems only partially
achieves this goal. According to Chipman, Davis and Shafto
(1986), many studies have shown the tremendous effectiveness of 
individual tutoring by master teachers. 

It is very important that analysis of students' performances 
on a test be similar to various levels of analyses done by human 
teachers while individual tutoring is given. Although the 
context of this discussion is task analysis, the methodology to 
be introduced can be applied in more general contexts such as 
skill analysis, job analysis or content analysis. 

Identifying subcomponents of tasks in a given problem-solving
domain and abstracting their attributes is still an art. It is
also necessary that the process be made automatic and objective.
However, we here assume that the tasks are already divided into
components (subtasks) and that any task in the domain can be
expressed by a combination of cognitively relevant prime
subcomponents. Let us denote these by $A_1, \ldots, A_K$ and call
them a set of attributes.



Insert Figure 1 about here 



Determination of Direct Relations Between Attributes 

Graph theory is a branch of mathematics that has been widely 
used in connection with tree diagrams consisting of nodes and 
arcs. In practical applications of graph theory, nodes represent 
objects of substantive interest and arcs show the existence of 
some relationship between two objects. In the task-analysis
setting, the objects correspond to attributes. The definition of
a direct relation is determined by the researcher using graph
theory, on the basis of the purpose of his/her study.

For instance, $A_k \to A_l$ if $A_k$ is an immediate prerequisite
of $A_l$ (Sato, 1990), or $A_k \to A_l$ if $A_k$ is easier than
$A_l$ (Wise, 1981).
These direct relations are rather logical but there are also 
studies using sampling statistics such as proximity of two 
objects (Hubert, 1974) or dominance relations (Takeya, 1981). 
(See M. Tatsuoka (1986) for a review of various applications of 
graph theory in educational and behavioral research.) 

The direct relations defined above can be represented by a
matrix called the adjacency matrix $A = (a_{kl})$ where

$$
a_{kl} =
\begin{cases}
1 & \text{if a direct relation exists from } A_k \text{ to } A_l \\
0 & \text{otherwise}
\end{cases}
$$

If a direct relation exists from $A_k$ to $A_l$ and also from
$A_l$ to $A_k$, then $A_k$ and $A_l$ are said to be equivalent. In
this case, the elements $a_{kl}$ and $a_{lk}$ of the adjacency
matrix are both one.

There are many ways to define a direct relationship between
two attributes, but we will use a "prerequisite" relation in this
paper. One of the open-ended questions shown in Bennett et al.
(1990) will be used as an example to illustrate various new
terminologies and concepts in this study.

Item 1: How many minutes will it take to fill a
        2,000-cubic-centimeter tank if water flows in at the
        rate of 20 cubic centimeters per minute and is
        pumped out at the rate of 4 cubic centimeters per
        minute?

This problem is a two-goal problem and the main canonical
solution is:

1. Net filling rate = 20 cc per minute - 4 cc per minute

2. Net filling rate = 16 cc per minute

3. Time to fill tank = 2000 cc / 16 cc per minute

4. Time to fill tank = 125 minutes.

Let us define the attributes involved in this problem:

$A_1$ : First goal is to find the net filling rate

$A_2$ : Compute the rate

$A_3$ : Second goal is to find the time to fill the tank

$A_4$ : Compute the time.

In this example, $A_1$ is a prerequisite of $A_2$, $A_2$ is a
prerequisite of $A_3$, and $A_3$ is a prerequisite of $A_4$. This
relation can be written as a chain, $A_1 \to A_2 \to A_3 \to A_4$.
This chain can be expressed by an adjacency matrix whose cells
are $a_{12} = a_{23} = a_{34} = 1$; all others are zero.

$$
\text{Adjacency matrix } A =
\begin{array}{c|cccc}
    & A_1 & A_2 & A_3 & A_4 \\ \hline
A_1 & 0 & 1 & 0 & 0 \\
A_2 & 0 & 0 & 1 & 0 \\
A_3 & 0 & 0 & 0 & 1 \\
A_4 & 0 & 0 & 0 & 0
\end{array}
$$


This adjacency matrix A is obtained from the relationships
among the attributes which are required for solving item 1. The
prerequisite relations expressed in the adjacency matrix A in
this example may change if we add new items. For instance, if a
new item that requires only the attributes $A_3$ and $A_4$ to
reach the solution is added to the item pool consisting of only
item 1, then $A_1$ may not be considered as a prerequisite of
$A_3$ any more. The prerequisite relation, in practice, must be
determined by a task analysis of a domain, and usually it is
independent of the items that are in an item pool.

Reachability Matrix: Representation of All the Relations, Both
Direct and Indirect

Warfield (1973a, b) developed a method called "interactive
structural modeling" in the context of switching theory.

By his method, the adjacency matrix shown above indicates
that there are direct relations from $A_1$ to $A_2$, from $A_2$
to $A_3$ and from $A_3$ to $A_4$, but no direct relations other
than these three arcs. However, a directed graph (or digraph)
consisting of $A_1$, $A_2$, $A_3$, and $A_4$ shows that there is
an indirect relation from $A_1$ to $A_3$, from $A_2$ to $A_4$,
and from $A_1$ to $A_4$.

Warfield showed that we can get a reachability matrix by
multiplying the matrix A + I (the sum of the adjacency matrix A
and the identity matrix I) by itself n times in terms of
Boolean algebra operations. The reachability matrix indicates
reachability in at most n steps ($A_k$ to $A_l$), whereas the
adjacency matrix contains reachability in exactly one step
($A_k$ to $A_l$) [a node is reachable from itself in zero steps].
The reachability matrix of the example in the previous section
is given below:

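$$
R = (A + I)^3 = (A + I)^4 = (A + I)^5 =
\begin{array}{c|cccc}
    & A_1 & A_2 & A_3 & A_4 \\ \hline
A_1 & 1 & 1 & 1 & 1 \\
A_2 & 0 & 1 & 1 & 1 \\
A_3 & 0 & 0 & 1 & 1 \\
A_4 & 0 & 0 & 0 & 1
\end{array}
$$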



where the definition of Boolean operations is as follows:

1 + 1 = 1, 1 + 0 = 0 + 1 = 1, 0 + 0 = 0 for addition and
1 x 1 = 1, 0 x 1 = 1 x 0 = 0, 0 x 0 = 0 for multiplication.

The reachability matrix indicates that all attributes are
related directly or indirectly. From the chain above, it is
obvious that although $A_k$ and $A_{k+1}$ relate directly, $A_k$
and $A_{k+2}$ relate indirectly.
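
The Boolean computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `bool_matmul` and `reachability` are hypothetical, and the sketch squares A + I repeatedly until it stops changing rather than multiplying it n times, which yields the same matrix for the chain $A_1 \to A_2 \to A_3 \to A_4$.

```python
def bool_matmul(x, y):
    """Boolean matrix product: multiplication is AND, addition is OR."""
    inner = len(y)
    return [
        [int(any(x[i][k] and y[k][j] for k in range(inner)))
         for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def reachability(adjacency):
    """Reachability matrix obtained from Boolean powers of A + I."""
    n = len(adjacency)
    # A + I under Boolean addition: put ones on the diagonal.
    r = [[int(adjacency[i][j] or i == j) for j in range(n)] for i in range(n)]
    while True:
        squared = bool_matmul(r, r)
        if squared == r:  # further powers add no new relations
            return r
        r = squared


# Adjacency matrix of the chain A1 -> A2 -> A3 -> A4.
A = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
R = reachability(A)
for row in R:
    print(row)
```

The loop stops as soon as a power equals the previous one, which mirrors the observation that the higher powers of A + I in the example are all equal.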

This form of digraph representation of attributes can be
applied to evaluation of instructional sequences, curriculum
evaluation, and documentation analysis, and has proved to be very
useful (Sato, 1990). Moreover, the reachability matrix can
provide us with information about cognitive structures of
attributes. However, application to assessment analysis requires
extension of the original method introduced by Warfield.

A Theory of Item Design Appropriate For 
The Constructed-Response Format 
An Incidence Matrix In Assessment Analysis 

The adjacency matrix $(a_{kl})$ is a square matrix of order
K x K, where K is the number of attributes and $a_{kl}$ represents
the existence or absence of a direct directed relation from $A_k$
to $A_l$.

Let us consider a special case.

When the adjacency matrix A is a null matrix, and hence A + I is
the identity matrix of order K, there is no direct relation
among the attributes. Let $\Omega$ be the set
$\{A_1, A_2, \ldots, A_K\}$ and L be the set of all subsets of
$\Omega$,

$$
L = [\{A_1\}, \{A_2\}, \ldots, \{A_1, A_2\}, \{A_1, A_3\}, \ldots, \{A_1, A_2, \ldots, A_K\}, \{\,\}];
$$

then L is called a lattice, in which the number of elements
is $2^K$.

In this case, we should be able to construct an item pool of
$2^K$ items in such a manner that each item involves only one
element of L. There is a row for each attribute and a column for
each item, and an element of 1 in the (k, j)-cell indicates that
item j involves attribute $A_k$ while 0 indicates that item j does
not involve $A_k$. Then this matrix of order $K \times 2^K$ (or
$K \times n$ for short) is called an incidence matrix,
$Q = (q_{kj})$, $k = 1, \ldots, K$ and $j = 1, \ldots, n$.

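For example, in the matrix Q below, the (K + 1)th column (item
K + 1) has the vector (1 1 0 ... 0), which corresponds to the
(K + 1)th set, $\{A_1, A_2\}$, in L.

$$
Q(K \times n) =
\begin{array}{c|ccccccccc}
       & i_1 & i_2 & \cdots & i_K & i_{K+1} & i_{K+2} & \cdots & i_{2^K-1} & i_{2^K} \\ \hline
A_1    & 1 & 0 & \cdots & 0 & 1 & 1 & \cdots & 1 & 0 \\
A_2    & 0 & 1 & \cdots & 0 & 1 & 0 & \cdots & 1 & 0 \\
A_3    & 0 & 0 & \cdots & 0 & 0 & 1 & \cdots & 1 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots & \vdots \\
A_K    & 0 & 0 & \cdots & 1 & 0 & 0 & \cdots & 1 & 0
\end{array}
$$
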

However, if K becomes large, say K = 20, then the number of
items in the item pool becomes astronomically large,
$2^{20}$ = 1,048,576. In practice, it might be very difficult to
develop a pool of constructed-response items so that each item
requires only one independent attribute. Constructed-response
items are usually designed to measure such functions as cognitive
processes, organization of knowledge and cognitive skills, and
theory changes required in solving a problem. These complex
mental activities require an understanding of all the
relationships which exist among the elements of $\Omega$. Some
attributes are connected by a direct relation while others are
isolated.

In general, the manner in which the attributes in $\Omega$
interrelate, one with another, bears a closer resemblance to the
arc/node tree configuration than to the unidimensional
chain shown in the previous section.

Suppose we modify the original water-filling-a-tank problem
to make four new items (beyond our original item 1), which
include the original attributes.

Item 2: What is the net filling rate of water if water
        flows in at the rate of 50 cc/min and out at the
        rate of 35 cc/min?

Item 3: What is the net filling rate of water if water
        flows in at the rate of h cc/min and out at the
        rate of d cc/min?

Item 4: How many minutes will it take to fill a
        1,000-cubic-centimeter tank if water flows in at the
        rate of 50 cubic centimeters per minute?

Item 5: How many minutes will it take to fill an
        x-cubic-centimeter water tank if water flows in at the
        rate of y cubic centimeters per minute?




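The incidence matrix Q for the five items will be:

$$
Q(4 \times 5) =
\begin{array}{c|ccccc}
    & i_1 & i_2 & i_3 & i_4 & i_5 \\ \hline
A_1 & 1 & 1 & 1 & 0 & 0 \\
A_2 & 1 & 1 & 0 & 0 & 0 \\
A_3 & 1 & 0 & 0 & 1 & 1 \\
A_4 & 1 & 0 & 0 & 1 & 0
\end{array}
$$
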



The prerequisite relations among the four attributes are
changed from the "totally ordered" chain,
$A_1 \to A_2 \to A_3 \to A_4$, to the partially ordered relation
stated below. That is, $A_1$ is a prerequisite of $A_2$, $A_3$ is
a prerequisite of $A_4$, but $A_2$ is not a prerequisite of either
$A_3$ or $A_4$. The relationship among the attributes is no longer
a totally-ordered chain but two totally-ordered chains,
$A_1 \to A_2$ and $A_3 \to A_4$.

Tatsuoka (1991) introduced the inclusion order among the row
vectors of an incidence matrix and showed that the set of row
vectors forms a Boolean algebra with respect to Boolean addition
and multiplication. In this Boolean algebra, the prerequisite
relation of two attributes becomes equivalent to the inclusion
order between two row vectors; that is, the row vectors $A_1$ and
$A_3$ include the row vectors $A_2$ and $A_4$, respectively, in
the Q(4 x 5) matrix above.

There is an interesting relationship between an incidence 
matrix Q(k x n) and the reachability matrix R(k x k) . A pairwise 
comparison over all the combinations of the row vectors of 
Q(k X n) matrix with respect to the inclusion order will yield 
the reachability matrix R(k x k) in which all the relations 
logically existing among the k attributes, both direct and
indirect, are expressed. This property is very useful for
examining the quality and cognitive structures of an item 
pool. 
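
The pairwise comparison of row vectors can be illustrated with a short Python sketch, using the Q(4 x 5) matrix of the five water-tank items. The helper names `includes` and `reachability_from_incidence` are hypothetical, introduced here only for illustration; the convention assumed is that cell (k, l) of R is 1 when row $A_k$ includes row $A_l$.

```python
def includes(u, v):
    """True if row vector u includes row vector v (u >= v elementwise)."""
    return all(a >= b for a, b in zip(u, v))


def reachability_from_incidence(q):
    """Cell (k, l) is 1 when row k of Q includes row l of Q."""
    n_attr = len(q)
    return [
        [int(includes(q[k], q[l])) for l in range(n_attr)]
        for k in range(n_attr)
    ]


# Incidence matrix Q(4 x 5): rows are A1..A4, columns are items 1..5.
Q = [
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [1, 0, 0, 1, 0],
]
R = reachability_from_incidence(Q)
for row in R:
    print(row)
```

For this item pool the result contains only the two chains $A_1 \to A_2$ and $A_3 \to A_4$ in addition to the diagonal.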

The adjacency and reachability matrices of the GRE items
given earlier are given below:

$$
A(4 \times 4) =
\begin{pmatrix}
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0
\end{pmatrix},
\qquad
R(4 \times 4) =
\begin{pmatrix}
1 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 \\
0 & 0 & 0 & 1
\end{pmatrix}
$$

However, the reachability matrix of the case given in Q(k x n)
in which k attributes have no relations will be the identity
matrix of order k. This result can be easily confirmed by
examining the inclusion relation of all pairs of the row vectors
of the matrix Q(k x n).

Connection of Our Deterministic Approach to Probability Theories

Tatsuoka and Tatsuoka (1987) introduced the slippage random
variable $S_j$, which is assumed to be independent across the
items, as follows:

If $S_j = 1$, then $X_j = 1 - R_j$, and if $S_j = 0$, then $X_j = R_j$,
or, equivalently, $S_j = |X_j - R_j|$.

A set $\{X_m\}$ forms a cluster around R (where $X_m$ is an item
response pattern that is generated by adding different numbers of
slips to the ideal item pattern R). The Tatsuokas showed that
the total number of slippages s in these "fuzzy" item patterns
follows a compound binomial distribution with the slippage
probabilities unique to each item. They called this distribution
the "bug distribution."

However, it is also the conditional distribution of s given
R, where R is a state of knowledge and capabilities. This is
called a state distribution for short. Once a distribution is
determined for each state of knowledge and capabilities, then
Bayes' decision rule for minimum errors can be applied to
classify any student's response patterns into one of these
predetermined states of knowledge and capabilities (Tatsuoka &
Tatsuoka, 1987).
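
As a rough illustration of the slippage variable and of the distribution of the total number of slips, the sketch below assumes that "compound binomial" means the distribution of a sum of independent slips with item-specific probabilities. The function names and the slip probabilities are hypothetical and are not taken from the cited studies.

```python
def slips(observed, ideal):
    """Slippage vector S_j = |X_j - R_j| for one response pattern."""
    return [abs(x - r) for x, r in zip(observed, ideal)]


def slip_count_distribution(slip_probs):
    """P(s = 0), ..., P(s = n) for independent slips with item-specific
    probabilities, built up one item at a time."""
    dist = [1.0]  # before any item is considered, s = 0 with certainty
    for p in slip_probs:
        nxt = [0.0] * (len(dist) + 1)
        for s, mass in enumerate(dist):
            nxt[s] += mass * (1.0 - p)  # no slip on this item
            nxt[s + 1] += mass * p      # a slip on this item
        dist = nxt
    return dist


ideal = [0, 0, 0, 1, 1]        # an ideal item pattern R
observed = [0, 1, 0, 1, 1]     # a "fuzzy" pattern with one slip
print(slips(observed, ideal))  # slip on the second item only

probs = [0.1, 0.2, 0.1, 0.05, 0.1]  # hypothetical slip probabilities
print(slip_count_distribution(probs))
```

Under this reading, each knowledge state R would have its own vector of slip probabilities, and hence its own state distribution.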

The notion of classification has an important implication 
for education. Given a response pattern, we want to determine 
the state to which the students' misconception is the closest and 
we want to answer the question: "What misconception, leading to 
what incorrect rule of operation, did this subject most likely 
have?" or "What is the probability that the subject's observed 
responses have been drawn from each of the predetermined states?" 
This is error diagnosis. 

For Bayes' decision rule for minimum errors, the
classification boundary of two groups of "fuzzy" response
patterns becomes the linear discriminant function when the state
distributions are multivariate normal and their covariance
matrices are approximately equal. Kim (1990) examined the effect
of violating the normality requirement and found that the
linear discriminant function is robust against this violation.
Kim further compared the classification results using the linear
discriminant functions and the K-nearest-neighbors method, which
is a non-parametric approach, and found that the linear
discriminant functions perform better. However, classification in
the n-dimensional space with many predetermined groups (as many
as 50 or 100 states) is not practical.

Tatsuoka (1983, 1985, 1990) proposed a model (called "rule
space") that is capable of diagnosing cognitive errors. Rule
space uses item response functions where the probability of a
correct response to item j is modeled as a function of the
student's "proficiency" (which is denoted by $\theta$) as
$P_j(\theta)$, with $Q_j(\theta) = 1 - P_j(\theta)$. Since the rule
space model maps all possible item response patterns into ordered
pairs $(\theta, \zeta)$, where $\zeta$ is an index measuring the
atypicality of response patterns (a projection operator, in
mathematical terms), all the error groups will also be mapped
into this Cartesian product space. The mapping is one-to-one
almost everywhere if the IRT functions are monotone increasing
(Tatsuoka, 1985; DiBello & Baillie, 1991).

Figure 3 illustrates the rule space configuration.

Insert Figure 3 about here

Rule space can be regarded as a technique for reducing the
dimensionality of the classification space. Furthermore, since
the clusters of "fuzzy" response patterns that are mapped into
the two-dimensional space follow approximately bivariate normal
distributions (represented by the ellipses shown in Figure 3),
Bayes' decision rules can be applied to classify a point in the
space into one of the ellipses shown in Figure 3 (M. Tatsuoka &
K. Tatsuoka, 1989; Tatsuoka, 1990).

Kim also compared the classification results using rule
space with Bayes' classifiers (the discriminant function
approach) and the non-parametric K-nearest-neighbors method.
He found that the rule space approach was efficient in terms of
CPU time, and that the classification errors were as small as
those produced by the other two methods.

Moreover, states located in the two extreme regions of the
$\theta$ scale tended to have singular within-groups covariance
matrices in the n-dimensional space; hence, classification using
discriminant functions could not be carried out for such cases.
The rule space classification, on the other hand, was always
obtainable and reasonably reliable.

We assumed the states for classification groups were
predetermined. However, determination of the universal set of
knowledge states is a complicated task and it requires a
mathematical tool, Boolean algebra, to cope with the problem of
combinatorial explosion (Tatsuoka, 1991).

We utilized a deterministic logical analysis to narrow down 
the fuzzy region of classification as much as possible to the 
extent that we would not lose the interpretability of 
misconceptions and errors. Then the probability notion, used to 
explain such uncertainties as instability of human performances 
on items, was used to express perturbations. 

Correspondence Between the Two Spaces. Attribute Responses and 
Item Responses 

Tatsuoka (1991) and Varadi & Tatsuoka (1989) introduced a
"Boolean descriptive function" f to establish a relationship
between the attribute responses and item responses.

For example, in the matrix Q(4 x 5), a subject who cannot
do $A_1$ but can do $A_2$, $A_3$, and $A_4$ will have a score of 1
for those items that do not involve $A_1$ and a score of 0 for
those that do involve $A_1$. Thus, the attribute pattern (0 1 1 1)
corresponds to the observable item pattern (0 0 0 1 1).

By making the same kinds of hypotheses on the different
elements of L and applying these hypotheses to the row vectors of
the incidence matrix Q, we can derive the item patterns that are
logically possible for a given Q matrix. These item patterns are
called ideal item patterns (denoted by Ys).

Generally speaking, the relationship between the two spaces,
the attribute and item spaces, is not as straightforward as in the
example of Q(4 x 5). This is because partial order relations
among the attributes almost always exist and a given item pool
often does not include the universal set of items which involve
all possible combinations of attributes.

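A Case When There Is No Relation Among the Attributes

Suppose there are four attributes in a domain of testing,
and that the universal set of $2^4$ = 16 items is constructed; then
the incidence matrix of the $2^4$ items is given below:

$$
Q(4 \times 16) =
\begin{array}{c|cccccccccccccccc}
    & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 15 & 16 \\ \hline
A_1 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 1 \\
A_2 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 1 \\
A_3 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 1 \\
A_4 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 1 & 1
\end{array}
$$
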



A hypothesis that states "this subject cannot do $A_k$ but can
do $A_1, \ldots, A_{k-1}, A_{k+1}, \ldots, A_K$ correctly"
corresponds to the attribute pattern (1 ... 1 0 1 ... 1). Applied
to this attribute pattern, the function f then produces the item
pattern where $X_j = 1$ if item j does not involve $A_k$, and
$X_j = 0$ if item j involves $A_k$. This function is defined as a
Boolean descriptive function. Sixteen possible attribute patterns
and the images of f (16 ideal item patterns) are summarized in
Table 1 below.

Insert Table 1 about here 
For instance, attribute response pattern 10 indicates that
a subject cannot do $A_1$ and $A_3$ correctly but can do $A_2$ and
$A_4$. Then from the incidence matrix Q(4 x 16) shown above, we see
that the scores of items 2, 4, 6, 7, 8, 9, 11, 12, 13, 14, 15, and
16 must become zero while the scores of items 1, 3, 5, and 10 must
be 1.

Table 1 indicates that any responses to the 16 items can be
classified into one of the 16 predetermined groups. They are the 
universal set of knowledge and capability states that are derived 
from the incidence matrix Q(4 x 16) by applying the properties of 
Boolean algebra. In other words, the 16 ideal item patterns 
exhaust all the possible patterns logically compatible with the 
constraints imposed by the incidence matrix Q(4 x 16). By 
examining and comparing a subject's responses with these 16 ideal 
item patterns, one can infer the subject's performances on the 
unobservable attributes. As long as these attributes represent 
the true task analysis, any response patterns of the above 16 
items, which differ from the 16 ideal item patterns, are regarded 
as fuzzy patterns or perturbations resulting from some lapses or 
slips on one or more items, reflecting random errors. 
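
The mapping from attribute patterns to ideal item patterns for Q(4 x 16) can be sketched as follows. This is a minimal illustration, assuming that an item is answered correctly exactly when every attribute it involves has been mastered; the function name `ideal_item_pattern` is hypothetical.

```python
from itertools import product

# Rows of Q(4 x 16) for A1..A4; columns are items 1..16.
Q = [
    [int(c) for c in "0100011100011101"],
    [int(c) for c in "0010010011011011"],
    [int(c) for c in "0001001010110111"],
    [int(c) for c in "0000100101101111"],
]


def ideal_item_pattern(q, attribute_pattern):
    """Score 1 on item j only if every attribute involved in j is mastered."""
    n_attr, n_items = len(q), len(q[0])
    return [
        int(all(attribute_pattern[k] == 1 or q[k][j] == 0
                for k in range(n_attr)))
        for j in range(n_items)
    ]


# A subject who cannot do A1 and A3 but can do A2 and A4.
pattern = ideal_item_pattern(Q, [0, 1, 0, 1])
correct_items = [j + 1 for j, score in enumerate(pattern) if score == 1]
print(correct_items)

# All 16 attribute patterns give 16 distinct ideal item patterns.
all_ideal = {tuple(ideal_item_pattern(Q, list(a)))
             for a in product((0, 1), repeat=4)}
print(len(all_ideal))
```

Comparing an observed response vector with the members of `all_ideal` is the deterministic step; any mismatch would be treated as a slip in the sense described above.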
A Case When There Are Prerequisite Relations Among the Attributes 

So far we have not assumed any relations among the four
attributes in Table 1. It is often the case that some attributes
are directly related one to another. Suppose $A_1$ is a
prerequisite of $A_2$, $A_2$ is a prerequisite of $A_3$, and $A_1$
is also a prerequisite of $A_4$.

Insert Figure 2 about here 
If we assume that a subject cannot do $A_1$ correctly, then
$A_2$, $A_3$ and $A_4$ cannot be correct because they require
knowledge of $A_1$ as a prerequisite. Therefore, the attribute
patterns 3, 4, 5, 9, 10, 11, and 15 in Table 1 become (0 0 0 0),
which is pattern 1.

By an argument similar to the above paragraph, "cannot do $A_2$"
implies "cannot do $A_3$". In this case the attribute patterns 2
and 7, and the patterns 8 and 14, are respectively no longer
distinguishable. Table 2 summarizes the implications of the
relations assumed above among the four attributes.

Insert Table 2 about here 

The number of attribute patterns has been reduced from 16 to
7. The item patterns associated with these seven attribute
patterns are given in the right-hand column, in which each
pattern still has 16 elements. It should be noted that we do not
need 16 items to distinguish seven attribute patterns. Items 2,
3, 4, 5, 10, and 11 are sufficient to provide the different ideal
item patterns, (000000), (100000), (100100),
(110110), (110000), (111000), (111111), which
are obtained from the second through fifth columns, and the 10th
and 11th columns of the ideal item patterns in Table 2.
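
The reduction from 16 to 7 attribute patterns, and the six-item ideal patterns listed above, can be reproduced with a small Python sketch. The prerequisite list encodes $A_1 \to A_2 \to A_3$ and $A_1 \to A_4$ with zero-based indices; the names `PREREQ`, `ITEMS` and `item_pattern` are hypothetical.

```python
from itertools import product

# (p, q) means attribute p is a prerequisite of attribute q (zero-based).
PREREQ = [(0, 1), (1, 2), (0, 3)]

# Attribute patterns in which no attribute is mastered without its prerequisite.
states = [s for s in product((0, 1), repeat=4)
          if all(s[p] >= s[q] for p, q in PREREQ)]
print(len(states))

# Attributes involved in items 2, 3, 4, 5, 10 and 11 of Q(4 x 16).
ITEMS = {2: {0}, 3: {1}, 4: {2}, 5: {3}, 10: {1, 3}, 11: {2, 3}}


def item_pattern(state):
    """Ideal scores on the six items for one attribute pattern."""
    return tuple(int(all(state[k] for k in attrs)) for attrs in ITEMS.values())


for s in states:
    print(s, item_pattern(s))
```

Each of the seven states yields a different six-item pattern, so the six items are enough to tell the states apart under these assumptions.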

The seven reduced attribute patterns given in Table 2 can be
considered as a matrix of order 7 x 4. The four column
vectors, which are associated with attributes $A_1$, $A_2$, $A_3$
and $A_4$, satisfy the partial order defined by the inclusion
relation. Expressing the inclusion relationships among the four
attributes, $A_1$ (column 1), $A_2$ (column 2), $A_3$ (column 3)
and $A_4$ (column 4), in a matrix results in the following
reachability matrix R:

$$
R =
\begin{pmatrix}
1 & 1 & 1 & 1 \\
0 & 1 & 1 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
$$




It is easy to verify that R can be derived from the 
adjacency matrix A obtained from the prerequisite relations 
among the four attributes: A1 -> A2 -> A3 and A1 -> A4. 
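
One way to carry out this verification is sketched below in Python. It builds an adjacency matrix from the stated prerequisite relations, adds the identity matrix, and takes the Boolean transitive closure with Warshall's algorithm; it then compares the result with the inclusion relations among the columns of the seven attribute patterns in Table 2. The helper names are illustrative, and the choice of Warshall's algorithm is an assumption, since the text does not name a procedure.

```python
# Direct prerequisite relations among A1..A4 (0-based indices):
# A1 -> A2, A2 -> A3, A1 -> A4.
N = 4
EDGES = [(0, 1), (1, 2), (0, 3)]


def reachability(n, edges):
    """Boolean transitive closure of (adjacency + identity), Warshall's algorithm."""
    r = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        r[i][j] = 1
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if r[i][k] and r[k][j]:
                    r[i][j] = 1
    return r


def inclusion_matrix(patterns):
    """Entry (k, l) is 1 when column l is included in column k,
    that is, whenever attribute l is mastered attribute k is mastered too."""
    n = len(patterns[0])
    return [
        [1 if all(p[k] >= p[l] for p in patterns) else 0 for l in range(n)]
        for k in range(n)
    ]


# The seven attribute patterns of Table 2, one row per knowledge state.
TABLE_2 = [
    (0, 0, 0, 0), (1, 0, 0, 0), (1, 0, 0, 1), (1, 1, 0, 1),
    (1, 1, 0, 0), (1, 1, 1, 0), (1, 1, 1, 1),
]

R = reachability(N, EDGES)
for row in R:
    print(row)
print(R == inclusion_matrix(TABLE_2))
```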

An approach to design constructed-response items for a diagnostic 
test. 

Notwithstanding the above, it is sometimes impossible to 
construct items like 2, 3, 4, and 5, which involve only one 
attribute per item. This is especially true when we are dealing 
with constructed-response items, where we have to measure much 
more complicated processes such as the organization of knowledge 
and cognitive tasks. In these cases, it is natural to assume that 
each item will involve several attributes. By examining Table 
2, one can find several sets of items for which the seven 
attribute patterns produce exactly the same seven ideal item 
patterns as those in Table 2. 

Two examples are the sets {2,3,4,5,10,11} and 
{2,3,4,5,13,11}. These two sets of items are just examples that 
are quickly obtained from Table 2. There are 128 different sets 
of items which produce the seven ideal item patterns when the 
seven attribute patterns in Table 2 are applied. This means that 
there are many possibilities for selecting an appropriate set of 
six items so as to maximize the diagnostic capability of a test. 
The common condition for selecting these sets of items can be 
generalized by the use of Boolean algebra, but a detailed 
discussion will not be given in this paper. 

This simple example implies that this systematic item 
construction method enables us to measure unobservable underlying 
cognitive processes via observable item response patterns. 
However, if the items are constructed without taking these 
requirements into account, then instructionally useful feedback 
or cognitive error diagnoses may not always be obtainable. 
Explanation with GRE math items 

The five items associated with the GRE water-filling problem 
were given in an earlier section. Using the BUGLIB program 
(Varadi & Tatsuoka, 1989), the incidence matrix Q (4 x 5) 
produces nine ideal item patterns and attribute patterns. Table 3 
summarizes them. 

Insert Table 3 about here 

The prerequisite relations A1 -> A2 and A3 -> A4 imply some 
constraints on attribute patterns: the attribute pattern (0 1) 
for the pairs (A1, A2) and (A3, A4) cannot exist logically. A 
close examination of Table 1 reveals that the constraints result 
in nine distinguishable attribute patterns. Patterns 3, 5, and 10 
reduce to pattern 1, that is, (0000); 8 to 2, that is, (1000); 
9 to 4, (0010); 13 to 6, (1100); and 15 to 11, (0011). The 
remaining patterns are 7, (1010); 12, (1110); 14, (1011); and 
16, (1111). These attribute patterns are identical to the 
patterns given in Table 3. 
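
The reduction just described can be reproduced mechanically. The Python sketch below lists the 16 attribute patterns in the order of Table 1, switches off any attribute whose prerequisite is not mastered, and records which original patterns collapse onto which representative. The function name `collapse` and the data layout are illustrative assumptions and are not part of BUGLIB.

```python
# The 16 attribute patterns (A1, A2, A3, A4) in the order of Table 1.
TABLE_1 = [
    "0000", "1000", "0100", "0010", "0001", "1100", "1010", "1001",
    "0110", "0101", "0011", "1110", "1101", "1011", "0111", "1111",
]

# Prerequisite relations as (prerequisite, dependent), 0-based: A1 -> A2, A3 -> A4.
PREREQS = [(0, 1), (2, 3)]


def collapse(pattern, prereqs):
    """Turn off every attribute whose prerequisite is off, until nothing changes."""
    bits = [int(b) for b in pattern]
    changed = True
    while changed:
        changed = False
        for pre, dep in prereqs:
            if bits[dep] and not bits[pre]:
                bits[dep] = 0
                changed = True
    return "".join(str(b) for b in bits)


# Map each representative pattern to the Table 1 numbers that reduce to it.
groups = {}
for number, pattern in enumerate(TABLE_1, start=1):
    groups.setdefault(collapse(pattern, PREREQS), []).append(number)

for representative, numbers in groups.items():
    print(representative, numbers)
print(len(groups), "distinguishable attribute patterns")
```

Passing the pairs `(0, 1), (1, 2), (0, 3)` instead, which encode A1 -> A2 -> A3 and A1 -> A4, leaves seven distinct patterns, matching Table 2.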

It can be easily verified that the reachability matrix given 
in the earlier section (p. 13) is the same as the matrix which is 
obtained by examining the inclusion relationships among all 
combinations of the four column vectors of the attribute patterns 
in Table 3. This means that all possible knowledge states 
obtainable from the four attributes with the structure 
represented by R can be used for diagnosing a student's errors. 
The five GRE items are good items as far as a researcher's 
interest is to measure and diagnose the nine states of knowledge 
and capabilities listed in Table 3. 

Illustration With Real Examples 

Example 1: A Case of Discrete Attributes in Fraction Addition 
Problems 

Birenbaum & Shaw (1985) used Guttman's facet analysis 
technique (Guttman et al., 1991) to identify eight task-content 
facets for solving fraction addition problems. There were six 
operation facets that described the numbers used in the problems 
and two facets dealing with the results. Then, a task 
specification chart was created based on a design which combined 
the content facets with the procedural steps. Figure 4 shows the 
task specification chart. 

Insert Figure 4 about here 
The task specification chart describes two strategies for 
solving the problems, Methods A and B. Examinees who use 
Method A convert a mixed number (a b/c) into a simple fraction, 
(ac+b)/c. Users of Method B separate the whole number part from 
the fraction part and then add the two parts independently. In 
these cases, it is clear that when the numbers become larger in 
a fraction addition problem, Method A requires more computational 
skill to get the correct answer. Method B, on the other hand, 
requires a deeper understanding of the number system. 

Sets of attributes for the two methods are selected from the 
task specification chart in Figure 4 as follows: 



Problem: a b/c + d e/f 

      Attribute                                Method A    Method B 

A1    Convert (a b/c) to (ac+b)/c              used        not used 
A2    Convert (d e/f) to (df+e)/f              used        not used 
A3    Divide fraction by a common factor       used        used 
A4    Find the common denominator of c & f     used        used 
A5    Make equivalent fractions                used        used 
A6    Add numerators                           used        used 
A7    Divide numerator by denominator          used        used 
A8    Don't forget the whole number part       used        used 
B1    Separate a & d and b/c & e/f             not used    used 
B2    Add the whole numbers including 0        not used    used 

The two methods share all of the attributes except A1 and A2, 
which are used only in Method A, and B1 and B2, which are used 
only in Method B. The incidence matrices for the ten items in 
Birenbaum and Shaw (1985), for Methods A and B, are given in 
Table 4. 

Insert Table 4 about here 
A computer program written by Varadi and Tatsuoka (BUGLIB, 
1990) produces a list of all the possible "can/cannot" 
combinations of attributes, otherwise known as the universal set 
of attribute response patterns. 
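
The kind of enumeration described here can be illustrated with a brute-force Python sketch. It is not the BUGLIB algorithm, whose details are not given in this paper; it simply tries every "can/cannot" combination for a small, hypothetical incidence matrix and groups the combinations that yield the same ideal item pattern. The matrix `Q`, the attribute labels, and the function names are assumptions made for illustration.

```python
from itertools import product

# Hypothetical incidence matrix: rows are attributes, columns are items.
# Q[k][j] = 1 means item j involves attribute k.
Q = [
    [1, 0, 1, 0],  # attribute a1
    [0, 1, 1, 0],  # attribute a2
    [0, 1, 1, 1],  # attribute a3
]


def ideal_items(mastery, q):
    """Item j is correct only if every attribute it involves is mastered."""
    n_items = len(q[0])
    return tuple(
        int(all(mastery[k] for k in range(len(q)) if q[k][j]))
        for j in range(n_items)
    )


def knowledge_states(q):
    """Group all 2^K attribute patterns by the ideal item pattern they produce."""
    states = {}
    for mastery in product((0, 1), repeat=len(q)):
        states.setdefault(ideal_items(mastery, q), []).append(mastery)
    return states


states = knowledge_states(Q)
for item_pattern, attr_patterns in sorted(states.items(), reverse=True):
    print(item_pattern, attr_patterns)
print(len(states), "distinguishable states out of", 2 ** len(Q), "attribute patterns")
```

Attribute patterns that land in the same group cannot be told apart by these items, which is why the number of states is usually smaller than the number of attribute patterns.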

For Method A, 13 attribute patterns are obtained. The 
attribute patterns and their corresponding ideal item patterns 
are given in Table 5, where the attributes are denoted by the 
numbers 1 through 8 for A1 through A8, and 9 and 10 for B1 and 
B2, respectively. For instance, the second state, 2, has the 
attribute pattern 11111110, and its ideal item pattern is 
1111000100. 

Insert Table 5 about here 

It is interesting to note that the list given in Table 5 
contains no state of the form "cannot do an item that involves 
both of the attributes A1 and A2, but can do items that involve 
either A1 or A2 alone." If one would like to diagnose such a 
compound state, then a new attribute should be added to the list. 

Another interesting result is that A5 cannot be separated 
from A4 as long as we use only these ten items. In other words, 
the rows for A4 and A5 in the incidence matrix for Method A are 
identical. Shaw and Tatsuoka (1983) found many different errors 
that originated in attribute A5, making equivalent fractions, 
and these errors must be diagnosed for remediation (Bunderson & 
Ohlsen, 1983). In order to separate A5 from A4, we must add a 
new item which involves A4 but not A5, thereby making Row A5 
different from Row A4. 
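
A quick way to spot such inseparable attributes is to look for identical rows in the incidence matrix. The Python sketch below does this for a small hypothetical matrix (the rows and item columns are invented for illustration and are not the Table 4 data), and then shows that appending one item involving the first attribute but not the second makes the two rows differ.

```python
from itertools import combinations

# Hypothetical incidence matrix: one row per attribute, one column per item.
Q = {
    "common_denominator":   [0, 1, 1, 1],
    "equivalent_fractions": [0, 1, 1, 1],
    "add_numerators":       [1, 1, 1, 1],
}


def inseparable_pairs(q):
    """Return pairs of attributes whose rows are identical across all items."""
    return [(a, b) for a, b in combinations(q, 2) if q[a] == q[b]]


def add_item(q, involved):
    """Return a copy of q with one extra item column involving the given attributes."""
    return {name: row + [1 if name in involved else 0] for name, row in q.items()}


print(inseparable_pairs(Q))
# New item that needs a common denominator but no equivalent fractions.
Q_extended = add_item(Q, {"common_denominator"})
print(inseparable_pairs(Q_extended))
```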

Beyond asking the original "equivalent fraction" question, 
we now add an item to the existing item pool which asks, "What 
is the common denominator of 2/5 and 1/7?" This item tests the 
skill of finding common denominators correctly and distinguishes 
it from the separate skill required for making equivalent 
fractions. However, since the solutions to these questions are 
so closely related and interdependent, it may not be possible to 
measure the examinees' skills separately in terms of each 
function. 

If an examinee answers this item correctly but gets a wrong 
answer for items involving addition, such as 2/5 + 1/7, then it 
is more likely that the examinee has the skill for getting 
correct common denominators but not the skill for making 
equivalent fractions correctly. 

Thirteen knowledge and capability states are identified from 
the incidence matrix for Method B, and they are also summarized 
in Table 5. Some ideal item response patterns can be found in 
the lists for both Methods A and B. This means that for some 
cases we cannot diagnose a student's underlying strategy for 
solving these ten items. Our attribute list cannot distinguish 
whether a student converts a mixed number (a b/c) to an improper 
fraction, or separates the whole number part from the fraction 
part. If we can see the student's scratch paper and can examine 
the numerators prior to addition, then we can find which method 
the student used. There are two solutions to this problem. One 
is to use a computer for testing so that crucial steps during 
problem solving activities can be coded. The second is to add 
new items so that these three attributes, A,, and B, can be 
separated in the incidence matrix for Method B. 


Example 2: The Case of Continuous and Hierarchically Related 
Attributes in the Adult Literacy Domain 

Kirsch and Mosenthal (1990) have developed a cognitive model 
which underlies the performance of young adults on the so-called 
document literacy tasks. They identified three categories of 
variables which predict the difficulties of items with a multiple 
R of .94. 

Three categories of variables are defined: 

. "Document" variables (based on the structure and 
complexity of the document) 

. "Task" variables (based on the structural relation between 
the document and the accompanying question or directive) 

. "Process" variables (based on strategies used to relate 
information in the question or directive to information in 
the document) (Kirsch and Mosenthal, 1990, p. 5). 

The "Document" variables comprise six specific variables 
including the number of organizing categories in the document, 
the number of embedded organizing categories in the document and 
the number of specifics. These three variables are considered in 
our incidence matrix as the attributes for "Document" variables. 

The "Task" variables are determined on the basis of the 
structural relations between a question and the document that it 
refers to. The larger the number of units of information 
required to complete a task, the more difficult the task. Four 
attributes are selected from this variable group. 

Kirsch and Mosenthal's regression analysis showed that 
variables in the "Process" category influenced the item 
difficulties to a large extent. One of the variables in this 
category is the degree of correspondence, which is defined as the 
degree to which the information given in the question or 
directive matches the corresponding information in the document. 

The next variable represents the type of information which 
has to be developed to locate, identify, generate, or provide the 
requested information based on one or more nodes from a document 
hierarchy. Five hierarchically related attributes are determined 
from this variable group. 

The last variable is Plausibility of Distractors, which 
measures the ability to identify the extent to which information 
in the document matches features in a question's given and 
requested information. 

A total of 22 attributes are selected to characterize the 61 
items. Since the attributes in each variable group are totally 
ordered, i.e., A1 -> A2 -> A3 -> A4 -> A5, the number of possible 
combinations of "can/cannot" attributes is drastically reduced 
(Tatsuoka, 1991). One hundred fifty-seven possible attribute 
response patterns were derived by the BUGLIB program, and hence 
157 ideal item response patterns are produced. As was explained 
in the earlier section, these 157 ideal item response patterns 
correspond to the 157 state distributions that are multivariate 
normal. These states are used for classifying an individual 
examinee's response pattern. A sample of ten states with their 
corresponding attribute response patterns is shown in Table 6. 

Insert Table 6 about here 
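
To see why a total order shrinks the number of patterns so sharply, consider a single chain of five attributes in which each attribute is a prerequisite of the next. The brute-force Python sketch below counts the "can/cannot" patterns that respect such a chain; the function name is illustrative, and the single chain is a simplification of the 22-attribute structure described above.

```python
from itertools import product


def consistent_patterns(n):
    """Patterns over a chain A1 -> A2 -> ... -> An in which no attribute
    is mastered unless its prerequisite is mastered as well."""
    return [
        p for p in product((0, 1), repeat=n)
        if all(p[k] >= p[k + 1] for k in range(n - 1))
    ]


chain = consistent_patterns(5)
for p in chain:
    print("".join(map(str, p)))
print(len(chain), "of", 2 ** 5, "patterns remain")
```

In general a chain of n attributes admits n + 1 patterns instead of 2^n, so each additional chain multiplies the count by a small factor rather than doubling it for every attribute.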

As can be seen in Table 6, several subsets of attributes are 
totally ordered, and the elements of each subset form a chain. 
Further, 1500 subjects were classified into one of the 157 
misconception states by a computer program entitled RULESPACE 
(Tatsuoka, Baillie, & Sheehan, 1991). The numbers of subjects 
classified into these ten states are as follows: 157 subjects 
in State No. 1, 46 in No. 4, 120 in No. 11, 81 in No. 12, 37 in 
No. 14, 68 in No. 30, 12 in No. 32, 27 in No. 102, 11 in No. 138, 
and 4 in No. 156. 

While the interpretation of misconceptions for these results 
is described in detail elsewhere (Sheehan, Tatsuoka & Lewis, 
1991), State No. 11 (into which the largest number of subjects 
were classified) will be described here. 

The "cannot" attributes A18 and A19 are directly related, from 
A18 to A19. Therefore, as represented in Table 6, the statement 
can be made that "a subject classified in this state cannot do 
A18, and hence cannot, by default, do A19." Thus, the 
prescription for these subjects' errors is likely to be that they 
make mistakes when items have the following specific feature: 

"... Distractors appear both within an organizing category 
and across organizing categories, because different 
organizing categories list the same specifics but with 
different attributes" (Kirsch and Mosenthal, 1990, p. 30). 


Psychometric Theories Appropriate For 
A Constructed Response Format 
An incidence matrix suggests various scoring formulas for 
the items. 

First, binary scores for right or wrong answers can be 
obtained from the following condition: if a subject can perform 
all the attributes involved in an item correctly, then the 
subject will get a score of one on that item; otherwise the 
subject will get a score of zero. With this scoring formula, the 
simple logistic models (Lord & Novick, 1968) for binary responses 
can be used for estimating the scaling variable θ. 

Second, partial credit scores or graded response scores can 
be obtained from the incidence matrix if performance on the 
attributes is observable and can be measured directly. This 
condition permits the application of Masters' partial credit 
model (Masters, 1982) or Samejima's General Graded response model 
(Samejima, 1988) to the data. 

As far as error diagnoses are concerned, simple binary 
response models always work, even when performances on the 
attributes cannot be measured directly and are not observable. 
However, computer scoring (Bennett, Rock, Braun, Frye, Spohrer, 
and Soloway, 1990) or scoring by human raters or teachers can 
assign graded scores to the items. For example, the number of 
correctly processed attributes for each item could be a graded 
score. 

Muraki (1991) wrote a computer program for his modified 
version of Samejima's original graded response model (Samejima, 
1969). Muraki's program can also be used for Samejima's model 
itself. 

Third, a teacher may assign different weights to the 
attributes and give a student a score corresponding to the 
percentage of correct answers achieved, depending on how well the 
student performed on the attributes. Thus, the final score for 
the item becomes a continuous variable. Then Samejima's (1976, 
1988) General Continuous IRT model can be used to estimate the 
ability parameter θ. If the response time for each item is 
available, then her Multidimensional Continuous model can be 
applied to such data sets. 
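
The first three scoring rules can be sketched in a few lines of Python. The example scores a single item from a record of which of its attributes the examinee processed correctly; the attribute names, the weights, and the function names are hypothetical, and the weighted score is only one possible reading of "different weights".

```python
# Attributes involved in one hypothetical item, with teacher-assigned weights.
ITEM_ATTRIBUTES = ["A3", "A4", "A5", "A6"]
WEIGHTS = {"A3": 1.0, "A4": 2.0, "A5": 2.0, "A6": 1.0}


def binary_score(performed):
    """1 if every attribute involved in the item was performed correctly, else 0."""
    return int(all(performed[a] for a in ITEM_ATTRIBUTES))


def graded_score(performed):
    """Number of correctly processed attributes for the item."""
    return sum(1 for a in ITEM_ATTRIBUTES if performed[a])


def weighted_score(performed, weights):
    """Weighted proportion of correctly processed attributes, between 0 and 1."""
    total = sum(weights[a] for a in ITEM_ATTRIBUTES)
    earned = sum(weights[a] for a in ITEM_ATTRIBUTES if performed[a])
    return earned / total


# An examinee who handled everything except A5.
performed = {"A3": True, "A4": True, "A5": False, "A6": True}
print(binary_score(performed))             # binary response
print(graded_score(performed))             # graded response
print(weighted_score(performed, WEIGHTS))  # continuous response
```

The binary score discards the partial information that the graded and weighted scores retain, which is why the latter two require that performance on individual attributes be observable.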

Fourth, if a teacher is interested in particular 
combinations of attributes and assigns scores to nominal 
categories, say 1 = {can do A1 and A3}, 2 = {can do A1 and A2}, 
3 = {can do A2, A3 and A4}, and so on, then Bock's (1972) 
Polychotomous model can be utilized for estimating θ. 



A wide variety of Item Response Theory models accommodating 
binary, graded, polychotomous, and continuous responses have 
been developed in the past two decades. These models are built 
upon a hypothetical ability variable θ. We are not against the 
use of global item scores and total scores (e.g., the total 
score is a sufficient statistic for θ in the Rasch model), but 
it is necessary to investigate micro-level variables such as 
cognitive skills and knowledge and their structural relationships 
in order to develop a pool of "good" constructed-response items. 
The systematic item construction method enables us to measure 
unobservable underlying cognitive processes via observable item 
response patterns. 

This study introduces an approach for organizing a couple of 
dozen such micro-level variables and for investigating their 
systematic interrelationships. The approach utilizes 
deterministic theories, graph theory and Boolean algebra. When 
most micro-level variables are not easy to measure directly, an 
inference must be made from the observable macro-level measures. 
An incidence matrix for characterizing the underlying 
relationships among micro-level variables is the first step 
toward achieving our goal. Then a Boolean algebra that is 
formulated on a set of sets of attributes, or a set of all 
possible item response patterns obtainable from the incidence 
matrix, enables us to establish relationships between two worlds: 
attribute space and item space (Tatsuoka, 1991). 

A theory of item construction is introduced in this paper 
in conjunction with Tatsuoka's Boolean algebra work (1991). If a 
subset of attributes has a connected, directed relation and forms 
a chain, then the number of combinations of "can/cannot" 
attributes will be reduced dramatically. Thus, it will become 
easier for us to construct a pool of items by which a particular 
group of misconceptions of concern can be diagnosed with a 
minimum of classification errors. 

One of the advantages of the rule space model (Tatsuoka, 1983, 
1990) is that the model relates a scaled ability parameter θ to 
misconception states. For a given misconception state, that is, 
an error state, one can always identify the particular types of 
errors which relate to ability level θ. If the centroid of the 
state is located in the upper part of the rule space, then one 
can conclude that this type of error is rare. If the centroid 
lies on the θ axis, then this error type is observed very 
frequently. 

Although Rule space was developed in the context of binary 
IRT models, the concept and mathematics are general enough to be 
extended for use in more complicated IRT models. Further work to 
extend the rule space concept to accommodate complicated response 
models will be left for future research. 

Table 1 A List of 16 Ideal Item Response Patterns obtained from 
16 Attribute Response Patterns by a Boolean Description 
Function 

      Attribute response patterns    Ideal item response patterns 

 1    0000                           1000000000000000 
 2    1000                           1100000000000000 
 3    0100                           1010000000000000 
 4    0010                           1001000000000000 
 5    0001                           1000100000000000 
 6    1100                           1110010000000000 
 7    1010                           1101001000000000 
 8    1001                           1100100100000000 
 9    0110                           1011000010000000 
10    0101                           1010100001000000 
11    0011                           1001100000100000 
12    1110                           1111011010010000 
13    1101                           1110110101001000 
14    1011                           1101101100100100 
15    0111                           1011100011100010 
16    1111                           1111111111111111 



Table 2 A List of Attribute Response Patterns and Ideal Item 
Response Patterns Affected by Direct Relations of 
Attributes 

Original Patterns            Attribute Patterns    Ideal Item Patterns 

1, 3, 4, 5, 9, 10, 11, 15    0000                  1000000000000000 
2, 7                         1000                  1100000000000000 
8, 14                        1001                  1100100100000000 
13                           1101                  1110110101001000 
6                            1100                  1110010000000000 
12                           1110                  1111011010010000 
16                           1111                  1111111111111111 

Table 3 A List of Nine Knowledge and Capability States and Nine 
Ideal Item Patterns of GRE-math items 

     Attribute    Ideal Item 
     Patterns     Patterns      Description of States 

1    1111         11111         Can do everything 
2    1110         01101         Can do A1, A2, A3; cannot do A4 
3    1100         01100         Can do A1, A2; cannot do A3, A4 
4    1011         00111         Can do A1, A3, A4; cannot do A2 
5    1010         00101         Can do A1, A3; cannot do A2, A4 
6    1000         00100         Can do A1; cannot do A2, A3, A4 
7    0011         00011         Can do A3, A4; cannot do A1, A2 
8    0010         00001         Can do A3; cannot do A1, A2, A4 
9    0000         00000         Cannot do anything 

A1 : Goal is to find the net filling rate 
A2 : Compute the rate 
A3 : Goal is to find the time to fill the tank 
A4 : Compute the time. 




Table 4 Ten Items with Their Attribute Characteristics 
by Method A and Method B 

Item  Problem          Method A                      Method B 

 1    2 8/6 + 3 10/6   A1, A2, A3, A6, A7            B1, A3, A4, A5, A6, B2 
 2    3/5 + 1/5        A6                            same as by Method A 
 3    3 10/4 + 4 6/4   A1, A2, A3, A6, A7            B1, A3, A6, A7, A8, B2 
 4    7/4 + 5/4        A6, A7                        same as by Method A 
 5    3/4 + 1/2        A4, A5, A6, A7, A8            same as by Method A 
 6    2/5 + 12/8       A3, A4, A5, A6, A7, A8        same as by Method A 
 7    1/2 + 1 10/7     A2, A4, A5, A6, A7, A8        B1, A4, A5, A6, A7, A8, B2 
 8    1/3 + 1/2        A4, A5, A6                    same as by Method A 
 9    3 1/6 + 2 3/4    A1, A2, A4, A5, A6, A7, A8    B1, A4, A5, A6, B2 
10    5/6 + 1/3        A4, A5, A6, A7, A8            same as by Method A 



Table 5 A list of all the possible sets of attribute patterns 
derived from the incidence matrices given in Table 4 

Method A 

States  Cannot              Can                 Ideal Item Response Pattern 

 1      none                1,2,3,4,5,6,7,8     1111111111 
 2      8                   1,2,3,4,5,6,7       1111000100 
 3      4,5,8               1,2,3,6,7           1111000000 
 4      1                   2,3,4,5,6,7,8       0101111101 
 5      2,1                 3,4,5,6,7,8         0101110101 
 6      3                   1,2,4,5,6,7,8       0101101111 
 7      3,1                 2,4,5,6,7,8         0101101101 
 8      3,2,1               4,5,6,7,8           0101100101 
 9      1,2,3,8             4,5,6,7             0101000100 
10      1,2,3,4,5,8         6,7                 0101000000 
11      7,1,2,3,8           4,5,6               0100000100 
12      1,2,3,8,7,4,5       6                   0100000000 
13      1,2,3,4,5,6,7,8     none                0000000000 

Method B 

States  Cannot              Can                 Ideal Item Response Pattern 

 1      none                3,4,5,6,7,8,9,10    1111111111 
 2      8                   3,4,5,6,7,9,10      1101000110 
 3      4,5                 3,6,7,8,9,10        0111000000 
 4      9,10                3,4,5,6,7,8         0101110101 
 5      3                   4,5,6,7,8,9,10      0101101111 
 6      3,9,10              4,5,6,7,8           0101100101 
 7      3,8                 4,5,6,7,9,10        0101000110 
 8      3,8,9,10            4,5,6,7             0101000100 
 9      3,4,5,8,9,10        6,7                 0101000000 
10      7,3,8               4,5,6,9,10          0100000110 
11      3,7,8,9,10          4,5,6               0100000100 
12      3,4,5,7,8,9,10      6                   0100000000 
13      3,4,5,6,7,8,9,10    none                0000000000 

Table 6 The Ten States Selected from One-hundred Fifty-seven 
Possible States Yielded by Boolean Operation (via BUGLIB 
program) 

      States     Attribute Pattern         Direct Relation 
                                           Among Attributes 
                          1111111111222 
                 1234567890123456789012 

 1    No. 1      1111111111111111111111    None 
 2    No. 4      1111111111111111110111    None 
 3    No. 11     1111111111111111100111    A18 -> A19 
 4    No. 12     1111011111111111100111    A18 -> A19 
 5    No. 14     1111011110111111100111    A18 -> A19 
 6    No. 30     1111011100111111100111    A9 -> A10, A18 -> A19 
 7    No. 32     1100011100111111100110    A3 -> A4 -> A5, A9 -> A10 
 8    No. 102    1000011111111111111111    A2 -> A3 -> A4 -> A5 
 9    No. 138    1000011111111011110111    A2 -> A3 -> A4 -> A5 
10    No. 156    1000010000001110000100    A2 -> A3 -> A4 -> A5 
                                           A7 -> A8 -> A9 -> A10 
                                           A11 -> A12 -> A13 
                                           A16 -> A17 -> A18 -> A19 
                                           A21 -> A22 







[Figure 1: flowchart. Systematic analysis of task, skill, job, 
or content, followed by identifying prime components, abstracting 
attributes, and naming them A1, ..., Ak.] 

Figure 1 Examples of Attributes 



Figure 3 The Rule Space Configuration. 

The numbers in the nine ellipses indicate error states (e.g., 
State No. 5 is "one cannot do the operation of borrowing in 
fraction subtraction problems"), and X marks represent students' 
points (θ, ζ). 

[Figure 4: flowchart. Box labels include: this is the NUM of the 
result; copy c, this is the DENO of the result; divide NUM by 
DENO; divide fraction by CF; don't forget the WNP.] 

Figure 4 Task Specification Chart for Fraction Addition and 
Subtraction Problems. 

The symbol used to denote the general fraction form in this 
figure is a(b/c) + d(e/f); F is fraction; CD is common 
denominator; CF is common factor; WNP is whole number part; NUM 
is numerator; DENO is denominator; EF is equivalent fraction. 
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